Abstract. Smoothing methods have become part of the standard tool set
for the study and solution of nondifferentiable and constrained optimization
problems as well as a range of other variational and equilibrium problems. In
this note we synthesize and extend recent results due to Beck and Teboulle on
infimal convolution smoothing for convex functions with those of X. Chen on
gradient consistency for nonconvex functions. We use epi-convergence techniques
to define a notion of epi-smoothing that allows us to tap into the rich
variational structure of the subdifferential calculus for nonsmooth, nonconvex,
and nonfinite-valued functions. As an illustration of the versatility and range
of epi-smoothing techniques, the results are applied to the general constrained
optimization problem, for which nonlinear programming is a special case.



1 Introduction 

A standard approach to solving nonsmooth and constrained optimization problems
is to solve a related sequence of unconstrained smooth approximations
[7, 8, 9, 21, 29, 33, 37, 48, 53]. The approximations are constructed so that cluster points
of the solutions or stationary points of the approximating smooth problems are
solutions or stationary points for the limiting nonsmooth or constrained optimization
problem. In the setting of convex programming, there is now great interest
in these methods in the very large-scale setting (e.g., see [26] [44] [4Sj [49]), where 
first-order methods for convex nonsmooth optimization have been very success- 
ful. At the same time, there are many recent applications of smoothing methods 
to general nonlinear programming, equilibrium, and mathematical programs with 
equilibrium constraints, e.g., see[TOl[THl[Il|2Dll21|231|3IJ|31l3Sll^]- This paper 
is concerned with synthesizing and expanding the ideas presented in two important
recent papers on smoothing. The first is by Beck and Teboulle [7], which develops
a smoothing framework for nonsmooth convex functions based on infimal convolution.
The second is by Chen [21], which, among other things, studies the notion
of gradient consistency for smoothing sequences. Our goal is to extend the ideas
presented in [7] for convex functions to the class of convex composite functions and
provide conditions under which this extension preserves gradient consistency.
Our primary tool in this analysis is the notion of variational convergence called
epi-convergence [4, 53]. Epi-convergence is ideally suited to the study of the
variational properties of parametrized families of functions, allowing, for example,
the development of a calculus of smoothing functions, which is essential for the
applications to the nonlinear inverse problems that we have in mind [1, 2, 3].
Epi-smoothing is a weaker notion of smoothing than those considered in [7, Definition
2.1], where complexity results are one of the key contributions [7, Theorem 3.1]. It
is the complexity results that require stronger notions of smoothing. On the other
hand, our goal is to establish limiting variational properties in nonconvex applications,
in particular, gradient consistency (see [21, Theorem 1] and [15, Theorem 4.5]).

We begin in Section 2 by introducing the notions of epigraphical and set-valued
convergence upon which our analysis rests. We also introduce the tools from
subdifferential calculus [53] that we use to establish gradient consistency. In Section 3
we define epi-smoothing functions and develop a calculus for these smoothing functions
that includes basic arithmetic operations as well as composition. In Section 4
we give conditions under which the Beck and Teboulle [7] approach to smoothing
via infimal convolution also gives rise to epi-smoothing functions that satisfy gradient
consistency. These results are then applied to Moreau envelopes (e.g., see [53])
and extended piecewise linear-quadratic functions. In Section 5 we introduce convex
composite functions and give conditions under which the epi-smoothing results
of Section 4 can be extended to this class of functions. In Section 6 we conclude by
applying the smoothing results for convex composite functions to general nonlinear
programming problems.

Notation: Most of the notation used is standard. An element $x \in \mathbb{R}^n$ is
understood as a column vector, and $\overline{\mathbb{R}} := [-\infty, +\infty]$ is the extended real line. The
space of all real $m \times n$ matrices is denoted by $\mathbb{R}^{m\times n}$, and for $A \in \mathbb{R}^{m\times n}$, $A^T$ is its
transpose. The null space of $A$ is the set

$$\operatorname{nul} A := \{x \in \mathbb{R}^n \mid Ax = 0\}.$$

By $I_{n\times n}$ we mean the $n \times n$ identity matrix and by ones$(n, m)$ the $n \times m$ matrix
each of whose entries is the number 1.

Unless otherwise stated, $\|\cdot\|$ denotes the Euclidean norm on $\mathbb{R}^n$ and $\|\cdot\|_1$ denotes
the 1-norm. If $C \subset \mathbb{R}^n$ is nonempty and closed, the Euclidean distance function for
$C$ is given by

$$\operatorname{dist}(y \mid C) := \inf_{z \in C} \|y - z\|. \tag{1}$$

When $C$ is convex it is easily established that the distance function is a convex
function, and the optimization problem (1) has a unique solution $\Pi_C(y)$, which is called the
projection of $y$ onto $C$.
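As a concrete illustration of (1), the following minimal sketch computes the projection and the distance for a box, a closed convex set for which the projection reduces to componentwise clipping. The function names and the choice of the set are illustrative assumptions and do not appear in the text.

```python
import math

def project_onto_box(y, lower, upper):
    """Euclidean projection of y onto the box [lower_1, upper_1] x ... x [lower_n, upper_n]."""
    # For a box the projection separates by coordinate: clip each entry.
    return [min(max(yi, lo), hi) for yi, lo, hi in zip(y, lower, upper)]

def dist_to_box(y, lower, upper):
    """Euclidean distance dist(y | C) for the box C, evaluated via the projection."""
    p = project_onto_box(y, lower, upper)
    return math.sqrt(sum((yi - pi) ** 2 for yi, pi in zip(y, p)))

y = [2.0, -3.0, 0.5]
lower = [-1.0, -1.0, -1.0]
upper = [1.0, 1.0, 1.0]

p = project_onto_box(y, lower, upper)
d = dist_to_box(y, lower, upper)
print(p, d)
```

Here the first two coordinates are clipped to the faces of the box, so the distance is the norm of the clipped-off part.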

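For a sequence $\{x^k\} \subset \mathbb{R}^n$ and a (nonempty) set $X \subset \mathbb{R}^n$ we abbreviate the fact
that $x^k$ converges to $x \in \mathbb{R}^n$ and $x^k \in X$ for all $k \in \mathbb{N}$ by

$$x^k \to_X x.$$

Moreover, for a function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$, define

$$x^k \to_f x \quad :\Longleftrightarrow \quad x^k \to x \ \text{ and } \ f(x^k) \to f(x).$$

This type of convergence coincides with ordinary convergence when $f$ is continuous.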
For a real-valued function $f : \mathbb{R}^n \to \mathbb{R}$ differentiable at $x$, the gradient is given by
$\nabla f(x)$, which is understood as a column vector. For a function $F : \mathbb{R}^n \to \mathbb{R}^m$
differentiable at $x$, the Jacobian of $F$ at $x$ is denoted by $F'(x)$, i.e.,

$$F'(x) = \begin{pmatrix} \nabla F_1(x)^T \\ \vdots \\ \nabla F_m(x)^T \end{pmatrix} \in \mathbb{R}^{m\times n}.$$

In order to distinguish between single- and set-valued maps, we write $S : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$
to indicate that $S$ maps vectors from $\mathbb{R}^n$ to subsets of $\mathbb{R}^m$. The graph of $S$ is the
set

$$\operatorname{gph} S := \{(x, y) \mid y \in S(x)\},$$

which is equivalent to the classical notion when $S$ is single-valued.

2 Preliminaries 

In this section we review certain concepts from variational and nonsmooth analysis
employed in the subsequent analysis. The notation is primarily based on [53].
For an extended real-valued function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ its epigraph is given by

$$\operatorname{epi} f := \{(x, \alpha) \in \mathbb{R}^n \times \mathbb{R} \mid f(x) \le \alpha\},$$

and its domain is the set

$$\operatorname{dom} f := \{x \in \mathbb{R}^n \mid f(x) < +\infty\}.$$

The notion of the epigraph allows for very handy definitions of a number of prop- 
erties for extended real- valued functions (see (HJ [52l [53] ) . 

Definition 2.1 (Closed, proper, convex functions). A function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$
is called lower semicontinuous (lsc) (or closed) if $\operatorname{epi} f$ is a closed set. $f$ is called
convex if $\operatorname{epi} f$ is a convex set. A convex function $f$ is said to be proper if there
exists $x \in \operatorname{dom} f$ such that $f(x) \in \mathbb{R}$.

Note that these definitions coincide with the usual concepts for ordinary real-valued
functions. Moreover, it holds that a convex function is always (locally Lipschitz)
continuous on the (relative) interior of its domain [52, Theorem 10.4].
Furthermore, we point out that, in what follows, for an lsc, convex function
$f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$, we always exclude the case $f \equiv +\infty$, which means that we deal
with proper functions.

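An important function in this context is the (convex) indicator function of a set
$C \subset \mathbb{R}^n$ given by $\delta(\cdot \mid C) : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ with

$$\delta(x \mid C) = \begin{cases} 0 & \text{if } x \in C, \\ +\infty & \text{if } x \notin C. \end{cases}$$

The indicator function $\delta(\cdot \mid C)$ is convex if and only if $C$ is convex, and $\delta(\cdot \mid C)$ is
lsc if and only if $C$ is closed.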

A crucial role in our upcoming analysis is played by the concept of epi-convergence,
which is now formally defined.

Definition 2.2 (Epi-convergence). We say that a sequence $\{f_k\}$ of functions
$f_k : \mathbb{R}^n \to \overline{\mathbb{R}}$ epi-converges to $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ if

$$\operatorname*{Lim}_{k\to\infty} \operatorname{epi} f_k = \operatorname{epi} f,$$

where the Painlevé-Kuratowski notion of set-convergence as given by [53, Definition
4.1] is employed. In this case we write

$$\operatorname*{e\text{-}lim}_{k\to\infty} f_k = f \quad \text{or} \quad f_k \overset{e}{\to} f.$$

Epi-convergence for sequences of convex functions goes back to Wijsman [58, 59],
where it is called infimal convergence. The term epi-convergence arguably is due
to Wets [57].

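A handy characterization of epi-convergence is given by

$$f_k \overset{e}{\to} f \quad \Longleftrightarrow \quad \forall \bar x \in \mathbb{R}^n : \begin{cases} \forall \{x^k\} \to \bar x : & \liminf_{k\to\infty} f_k(x^k) \ge f(\bar x), \\ \exists \{x^k\} \to \bar x : & \limsup_{k\to\infty} f_k(x^k) \le f(\bar x), \end{cases} \tag{2}$$

see [53, Proposition 7.2], which we invoke in several places. For extensive surveys
of epi-convergence we refer the reader to [4] or [53, Chapter 7].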

We make use of the regular and limiting subdifferentials to describe the variational
behavior of nonsmooth functions. In constructing the limiting subdifferential, we
employ the outer limit for a set-valued mapping, which we now define along with
the inner limit:

For $S : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ and $X \subset \mathbb{R}^n$ the outer limit of $S$ at $\bar x$ relative to $X$ is given by

$$\operatorname*{Lim\,sup}_{x \to_X \bar x} S(x) := \{v \mid \exists \{x^k\} \to_X \bar x,\ \{v^k\} \to v : v^k \in S(x^k) \ \forall k \in \mathbb{N}\}$$

and the inner limit of $S$ at $\bar x$ relative to $X$ is defined by

$$\operatorname*{Lim\,inf}_{x \to_X \bar x} S(x) := \{v \mid \forall \{x^k\} \to_X \bar x\ \exists \{v^k\} \to v : v^k \in S(x^k) \ \forall k \in \mathbb{N}\}.$$

We say that $S$ is outer semicontinuous (osc) at $\bar x$ relative to $X$ if

$$\operatorname*{Lim\,sup}_{x \to_X \bar x} S(x) \subset S(\bar x).$$

In case that outer and inner limit coincide, we write

$$\operatorname*{Lim}_{x \to_X \bar x} S(x) := \operatorname*{Lim\,sup}_{x \to_X \bar x} S(x),$$

and say that $S$ is continuous at $\bar x$ relative to $X$.

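Definition 2.3 (Regular and limiting subdifferential). Let $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$
and $\bar x \in \operatorname{dom} f$.

a) The regular subdifferential of $f$ at $\bar x$ is the set given by

$$\hat\partial f(\bar x) := \{v \mid f(x) \ge f(\bar x) + v^T(x - \bar x) + o(\|x - \bar x\|)\}.$$

b) The limiting subdifferential of $f$ at $\bar x$ is the set given by

$$\partial f(\bar x) := \operatorname*{Lim\,sup}_{x \to_f \bar x} \hat\partial f(x).$$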

There are other ways to obtain the limiting subdifferential than the one described
above, which goes back to Mordukhovich, e.g., cf. [45]. See [17] or [43] for a
construction of the limiting subdifferential via Dini-derivatives.

It is a well-known fact, see [53, Proposition 8.12], that if $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is
convex, both the limiting and the regular subdifferential coincide with the
subdifferential of convex analysis, i.e.,

$$\partial f(\bar x) = \{v \mid f(x) \ge f(\bar x) + v^T(x - \bar x) \ \forall x \in \mathbb{R}^n\} = \hat\partial f(\bar x) \quad \forall \bar x \in \operatorname{dom} f.$$
The above subdifferentials are closely tied to normal cones. In fact, the regular and
the limiting normal cone, see [53, Definition 6.3], of a closed set $C \subset \mathbb{R}^n$ at $\bar x \in C$
can be expressed as

$$\hat N(\bar x \mid C) = \hat\partial \delta(\bar x \mid C) \quad \text{and} \quad N(\bar x \mid C) = \partial \delta(\bar x \mid C),$$

see [53, Exercise 8.14].

An important concept in the context of subdifferentiation is (subdifferential)
regularity. We say that $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is (subdifferentially) regular at $\bar x \in \operatorname{dom} f$
if

$$N((\bar x, f(\bar x)) \mid \operatorname{epi} f) = \hat N((\bar x, f(\bar x)) \mid \operatorname{epi} f).$$

Note that this regularity notion coincides with the one used in [53], see the
discussion on page 61 in [24] in combination with [53, Corollary 6.29].

3 Epi-Smoothing Functions 

In this section we lay out the general framework for the smoothing functions studied
in this paper. Let $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ be lsc. We say $s_f : \mathbb{R}^n \times \mathbb{R}_+ \to \mathbb{R}$ is an
epi-smoothing function for $f$ if the following two conditions are satisfied:

(i) $s_f(\cdot, \mu_k)$ epi-converges to $f$ for all $\{\mu_k\} \downarrow 0$, written

$$\operatorname*{e\text{-}lim}_{\mu \downarrow 0} s_f(\cdot, \mu) = f, \tag{3}$$

(ii) $s_f(\cdot, \mu)$ is continuously differentiable for all $\mu > 0$.

Note that (3) is always fulfilled, see [53, Theorem 7.11], under the following condition

$$\lim_{\mu \downarrow 0,\ x \to \bar x} s_f(x, \mu) = f(\bar x) \quad \forall \bar x \in \mathbb{R}^n, \tag{4}$$

which is called continuous convergence in [53]. As we will see in Section 4, however,
continuous convergence can be an excessively strong assumption, especially when
dealing with non-finite valued functions.
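To make the definition tangible, here is a small sketch for $f = |\cdot|$ on the real line using the candidate $s_f(x, \mu) = \sqrt{x^2 + \mu^2}$. This particular smoothing is a standard textbook choice used here as an assumption for illustration; it is not one of the constructions developed in this paper. Since $0 \le s_f(x,\mu) - |x| \le \mu$, the convergence is uniform, so condition (4) holds, and $s_f(\cdot,\mu)$ is continuously differentiable for every $\mu > 0$.

```python
import math

def s_abs(x, mu):
    """Candidate smoothing of f(x) = |x|: sqrt(x^2 + mu^2)."""
    return math.sqrt(x * x + mu * mu)

def grad_s_abs(x, mu):
    """Derivative of s_abs with respect to x; continuous in x whenever mu > 0."""
    return x / math.sqrt(x * x + mu * mu)

grid = [i / 10.0 for i in range(-30, 31)]
gaps = {}
for mu in (1.0, 0.1, 0.01):
    # Largest deviation from |x| on the grid; it is attained at x = 0 and equals mu.
    gaps[mu] = max(s_abs(x, mu) - abs(x) for x in grid)
    print(mu, gaps[mu])
```

The printed gaps shrink linearly with the smoothing parameter, which is the uniform bound mentioned above.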

The following result provides an elementary calculus for epi-smoothing functions.

Proposition 3.1. Let $g, h : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ be lsc and let $s_g$ and $s_h$ be
epi-smoothing functions for $g$ and $h$, respectively.

a) If $s_g$ converges continuously to $g$, then $s_f := s_g + s_h$ is an epi-smoothing
function for $f := g + h$.

b) If $g$ is continuously differentiable, then $s_f := g + s_h$ is an epi-smoothing
function for $f := g + h$.

c) If $\lambda > 0$, then $\lambda s_g$ is an epi-smoothing function for $\lambda g$.

d) If $A \in \mathbb{R}^{m\times n}$ has rank $m$ and $b \in \mathbb{R}^m$, then $s_f(\cdot, \cdot) := s_g(A(\cdot) + b, \cdot)$ is an
epi-smoothing function for $f := g(A(\cdot) + b)$.

Proof. Item a) follows from [53, Theorem 7.46], while b) follows from a) and the
fact that $g$ is a continuously convergent epi-smoothing function for itself. Item c) is
provided by [53, Exercise 7.8 d)]. Item d) is an immediate consequence of Theorem
3.2 and the discussion preceding it. □

To obtain a more powerful chain rule than the one given in item d) above, we 
need to invoke more refined tools from variational analysis. One such tool is metric 
regularity (e.g., see [TT] [47], [53] ) , originally defined for set-valued mappings. For a 
single-valued mapping $F : \mathbb{R}^n \to \mathbb{R}^m$ we say that $F$ is metrically regular at $\bar x \in \mathbb{R}^n$
if there exist $\gamma > 0$ and neighborhoods $W$ of $\bar x$ and $V$ of $F(\bar x)$ such that

$$\operatorname{dist}(x \mid F^{-1}(y)) \le \gamma \|F(x) - y\| \quad \forall x \in W,\ y \in V.$$

We say that $F$ is metrically regular if it is metrically regular at every $\bar x \in \mathbb{R}^n$. In
particular, $F$ is metrically regular if it is a locally Lipschitz homeomorphism (e.g.,
see [53, Corollary 9.55]). Mordukhovich has shown that metric regularity can be
fully characterized via the coderivative criterion, e.g., see [47, 53]. In the case of a
single-valued, continuously differentiable map $F : \mathbb{R}^n \to \mathbb{R}^m$ the coderivative criterion
reduces to the condition that $\operatorname{rank} F'(\bar x) = m$, that is,

$$F \text{ is metrically regular at } \bar x \quad \Longleftrightarrow \quad \operatorname{rank} F'(\bar x) = m.$$



Theorem 3.2. Let $g : \mathbb{R}^m \to \mathbb{R} \cup \{+\infty\}$ and let $s_g$ be an epi-smoothing function
for $g$. Furthermore, let $F : \mathbb{R}^n \to \mathbb{R}^m$ be continuously differentiable and metrically
regular. Then $s_f := s_g(F(\cdot), \cdot)$ is an epi-smoothing function for $f := g \circ F$.

Proof. The smoothness properties are obvious from the assumptions. Next, let
$\{\mu_k\} \downarrow 0$ be given and put $g_k := s_g(\cdot, \mu_k)$ and $f_k := g_k \circ F$. We need to show that
$f_k \overset{e}{\to} f$. For this purpose, we invoke the characterization of epi-convergence as
provided by (2). To this end, let $\bar x \in \mathbb{R}^n$ and $\{x^k\} \to \bar x$ be given. Then it follows
from the continuity of $F$, the fact that $g_k \overset{e}{\to} g$, and (2) that

$$\liminf_{k} f_k(x^k) = \liminf_{k} g_k(F(x^k)) \ge g(F(\bar x)) = f(\bar x). \tag{5}$$

Moreover, as $g_k \overset{e}{\to} g$, (2) yields a sequence $\{y^k\} \to \bar y := F(\bar x)$ such that

$$\limsup_{k} g_k(y^k) \le g(\bar y).$$

Since $F$ is metrically regular at $\bar x$, we obtain a sequence $\{\tilde x^k\} \to \bar x$ such that
$F(\tilde x^k) = y^k$ for all $k \in \mathbb{N}$. This, in turn, gives

$$\limsup_{k} f_k(\tilde x^k) = \limsup_{k} g_k(y^k) \le g(\bar y) = f(\bar x).$$

This, together with (5), proves (2) for $f_k$ with respect to $f$, and this concludes the
proof. □

Although epi-convergence is arguably a mild condition, it still provides desirable
convergence behavior for minimization in the following sense:

Theorem 3.3. [53, Theorem 7.33] Suppose the sequence $\{f_k\}$ is eventually
level-bounded (see [53, p. 266]), and $f_k \overset{e}{\to} f$ with $f_k$ and $f$ lsc and proper. Then

$$\inf f_k \to \inf f \quad \text{(finite)}.$$

Now, suppose a numerical algorithm produces sequences $\{x^k\} \to \bar x$ and $\{\mu_k\} \downarrow 0$
such that

$$\lim_{k\to\infty} \nabla_x s_f(x^k, \mu_k) = 0.$$

A natural question to ask in this context is whether $\bar x$ is a critical point of $f$ in the
sense that $0 \in \partial f(\bar x)$. A sufficient condition is, clearly, provided by

$$\operatorname*{Lim\,sup}_{x \to \bar x,\ \mu \downarrow 0} \nabla_x s_f(x, \mu) \subset \partial f(\bar x).$$

The next result shows that the converse inclusion is always valid if $s_f(\cdot, \mu) \overset{e}{\to} f$.

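Lemma 3.4. Let $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ be lsc and $s_f$ an epi-smoothing function for
$f$. Then for $\bar x \in \operatorname{dom} f$ we have

$$\partial f(\bar x) \subset \operatorname*{Lim\,sup}_{x \to \bar x,\ \mu \downarrow 0} \nabla_x s_f(x, \mu).$$

Proof. Let $v \in \partial f(\bar x)$ be given. Since by assumption $\operatorname*{e\text{-}lim}_{\mu \downarrow 0} s_f(\cdot, \mu) = f$, we may
invoke [53, Corollary 8.47] in order to obtain sequences $\{\mu_k\} \downarrow 0$, $\{x^k\} \to \bar x$ and
$\{v^k\}$ with $v^k \in \partial_x s_f(x^k, \mu_k)$ such that $v^k \to v$. Now, since $s_f(\cdot, \mu_k)$ is continuously
differentiable by assumption, we have

$$v^k = \nabla_x s_f(x^k, \mu_k),$$

which identifies $v$ as an element of $\operatorname*{Lim\,sup}_{x \to \bar x,\ \mu \downarrow 0} \nabla_x s_f(x, \mu)$ and thus, the
assertion follows. □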

A major contribution of this paper is the construction of smoothing functions having
the property that

$$\operatorname*{Lim\,sup}_{x \to \bar x,\ \mu \downarrow 0} \nabla_x s_f(x, \mu) = \partial f(\bar x) \tag{6}$$

at any point $\bar x \in \operatorname{dom} f$. This condition implies the notion of gradient consistency
defined in [21, Equation (4)], which is obtained by taking the convex hull on both
sides of this equation. However, since all of the functions we consider are
subdifferentially regular, Lemma 3.4 implies that (6) is equivalent to gradient consistency.



4 Epi-Smoothing via Infimal Convolution 

In this section we show that the class of smoothing functions for nonsmooth, convex
and lsc functions introduced in [7] fits into the framework laid out in Section 3.
As a by-product, we show that Moreau envelopes fulfill the requirements of our
smoothing setup.

The approach taken in [7, is based on infimal convolution [6l HTJ |42j [52j [53] . Given 
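two (extended real-valued) functions $f_1, f_2 : \mathbb{R}^n \to \overline{\mathbb{R}}$ the inf-convolution (or
epi-sum, see Lemma 4.2 b) in this context) is the function $f_1 \# f_2 : \mathbb{R}^n \to \overline{\mathbb{R}}$ defined
by

$$(f_1 \# f_2)(x) := \inf_{u \in \mathbb{R}^n} \{f_1(u) + f_2(x - u)\}.$$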

In what follows we assume that

(A) $g : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is proper, lsc, and convex, and

(B) $\omega : \mathbb{R}^n \to \mathbb{R}$ is convex and continuously differentiable with Lipschitz gradient.

Moreover, for $\mu > 0$, define the function $\omega_\mu : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ by

$$\omega_\mu(y) := \mu\, \omega\Bigl(\frac{y}{\mu}\Bigr).$$

Obviously, $\omega_\mu$ is also convex and continuously differentiable with Lipschitz gradient.
In [7], the authors consider the (convex) function

$$(g \# \omega_\mu)(x) = \inf_{u \in \mathbb{R}^n} \Bigl\{ g(u) + \mu\, \omega\Bigl(\frac{x - u}{\mu}\Bigr) \Bigr\} \quad (\mu > 0)$$

as a smoothing function for $g$. We now investigate conditions on $\omega$ for which the
inf-convolution $g \# \omega_\mu$ serves as an epi-smoothing function in the sense of Section 3.
In this context, the notion of coercivity plays a key role, where it arises as a natural
assumption on the function $\omega$. Several different notions of coercivity occur in the
literature. We now define those useful to our study.

Definition 4.1 (Coercive functions). Let $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ be lsc and convex.

a) $f$ is called 0-coercive if

$$\lim_{\|x\| \to \infty} f(x) = +\infty.$$

b) $f$ is called 1-coercive if

$$\lim_{\|x\| \to \infty} \frac{f(x)}{\|x\|} = +\infty.$$

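The first result establishes important properties of the function $g \# \omega_\mu$.

Lemma 4.2. If $\omega$ is 1-coercive (or 0-coercive and $g$ bounded from below) the
following holds:

a) $g \# \omega_\mu$ is finite-valued, i.e., $g \# \omega_\mu : \mathbb{R}^n \to \mathbb{R}$, and for all $x \in \mathbb{R}^n$ we have

$$(g \# \omega_\mu)(x) = \min_{u \in \mathbb{R}^n} \Bigl\{ g(u) + \mu\, \omega\Bigl(\frac{x - u}{\mu}\Bigr) \Bigr\} \quad \text{with} \quad \operatorname*{argmin}_{u \in \mathbb{R}^n} \Bigl\{ g(u) + \mu\, \omega\Bigl(\frac{x - u}{\mu}\Bigr) \Bigr\} \ne \emptyset.$$

b) We have

$$\operatorname{epi} (g \# \omega_\mu) = \operatorname{epi} g + \operatorname{epi} \omega_\mu.$$

c) $g \# \omega_\mu$ is continuously differentiable with

$$\nabla (g \# \omega_\mu)(x) = \nabla \omega\Bigl(\frac{x - u_\mu(x)}{\mu}\Bigr) = \nabla \omega_\mu(x - u_\mu(x)) \quad \forall x \in \mathbb{R}^n,$$

where $u_\mu(x) \in \operatorname*{argmin}_{u \in \mathbb{R}^n} \bigl\{ g(u) + \mu\, \omega\bigl(\frac{x - u}{\mu}\bigr) \bigr\}$.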



Proof. The assertion that

$$(g \# \omega_\mu)(x) < +\infty \quad \forall x \in \mathbb{R}^n$$

is due to the fact that $\omega$ is finite-valued and $g \not\equiv +\infty$. Moreover, $\omega_\mu$ obviously
inherits the respective coercivity properties from $\omega$. Hence, the remainder of a)
follows immediately from [6, Proposition 12.14].
In turn, b) follows from a) and [6, Proposition 12.8 (ii)].

Item c) is an immediate consequence of a) together with [7, Theorem 4.2 (c)]. □

The following auxiliary result, which is key for establishing the epigraphical limit
behavior of $g \# \omega_\mu$, states that the epigraphical limit of $\omega_\mu$ for $\mu \downarrow 0$ is $\delta(\cdot \mid \{0\})$ if and
only if $\omega$ is 1-coercive.

Lemma 4.3. $\omega$ is 1-coercive if and only if

$$\operatorname*{e\text{-}lim}_{\mu \downarrow 0} \omega_\mu = \delta(\cdot \mid \{0\}).$$

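Proof. First, let $\omega$ be 1-coercive.

We start by showing that $\operatorname*{Lim\,sup}_{\mu \downarrow 0} \operatorname{epi} \omega_\mu \subset \{0\} \times \mathbb{R}_+ = \operatorname{epi} \delta(\cdot \mid \{0\})$.

To this end, let $(z, \alpha) \in \operatorname*{Lim\,sup}_{\mu \downarrow 0} \operatorname{epi} \omega_\mu$. Then there exist sequences $\{z^k\} \to z$,
$\{\alpha_k\} \to \alpha$ and $\{\mu_k\} \downarrow 0$ such that

$$\mu_k\, \omega\Bigl(\frac{z^k}{\mu_k}\Bigr) \le \alpha_k \quad \forall k \in \mathbb{N}. \tag{7}$$

This can be written as

$$\omega\Bigl(\frac{z^k}{\mu_k}\Bigr) \le \frac{\alpha_k}{\mu_k} \quad \forall k \in \mathbb{N}.$$

It is immediately clear from this representation that $\alpha \ge 0$, since otherwise the
right-hand side would tend to $-\infty$, while the left-hand side remains either convergent
on a subsequence (if $\{z^k/\mu_k\}$ is bounded) or tends to $+\infty$ (if $\{z^k/\mu_k\}$ is unbounded).
Now, suppose that $z \ne 0$. Then $\{z^k/\mu_k\}$ is unbounded and (7) can be rewritten as

$$\frac{\omega(z^k/\mu_k)}{\|z^k/\mu_k\|} \le \frac{\alpha_k}{\|z^k\|} \quad \forall k \in \mathbb{N}.$$

By the 1-coercivity of $\omega$ the left-hand side tends to $+\infty$, while the right-hand side
is bounded, which is a contradiction. Hence, we have proven that $z = 0$ and $\alpha \ge 0$,
which shows that, in fact, $\operatorname*{Lim\,sup}_{\mu \downarrow 0} \operatorname{epi} \omega_\mu \subset \{0\} \times \mathbb{R}_+$.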

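We now show that $\operatorname*{Lim\,inf}_{\mu \downarrow 0} \operatorname{epi} \omega_\mu \supset \{0\} \times \mathbb{R}_+$. For this purpose, let $\alpha \ge 0$
and $\{\mu_k\} \downarrow 0$ be given. Then choose $z^k := 0$ and $\alpha_k := \alpha + \mu_k \omega(0) \ge \omega_{\mu_k}(z^k)$.
Then $(z^k, \alpha_k) \in \operatorname{epi} \omega_{\mu_k}$ for all $k \in \mathbb{N}$ and $(z^k, \alpha_k) \to (0, \alpha)$. This shows that
$\operatorname*{Lim\,inf}_{\mu \downarrow 0} \operatorname{epi} \omega_\mu \supset \{0\} \times \mathbb{R}_+$.

Putting together all the pieces of information, we see that

$$\operatorname*{Lim}_{\mu \downarrow 0} \operatorname{epi} \omega_\mu = \operatorname{epi} \delta(\cdot \mid \{0\}),$$

i.e.,

$$\operatorname*{e\text{-}lim}_{\mu \downarrow 0} \omega_\mu = \delta(\cdot \mid \{0\}).$$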



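Now, suppose that $\omega$ is not 1-coercive. Then there exists an unbounded sequence
$\{x^k\}$ such that either

$$\frac{\omega(x^k)}{\|x^k\|} \to -\infty$$

or $\{\omega(x^k)/\|x^k\|\}$ is bounded. Put $\mu_k := 1/\|x^k\| \to 0$. Then $\mu_k x^k = x^k/\|x^k\|$
and we have

$$\omega_{\mu_k}\Bigl(\frac{x^k}{\|x^k\|}\Bigr) = \mu_k\, \omega(x^k) = \frac{\omega(x^k)}{\|x^k\|} \quad \forall k \in \mathbb{N}. \tag{8}$$

If $\omega(x^k)/\|x^k\| \to -\infty$, we infer from (2) that $\omega_{\mu_k}$ does not converge epigraphically at all (in
particular not to $\delta(\cdot \mid \{0\})$), since we have $\omega_{\mu_k}(x^k/\|x^k\|) \to -\infty$.
In case that $\{\omega(x^k)/\|x^k\|\}$ is bounded, we may assume w.l.o.g. that

$$\frac{\omega(x^k)}{\|x^k\|} \to \bar\omega$$

for some $\bar\omega \in \mathbb{R}$. Then we infer from (8) that

$$(\bar x, \bar\omega) \in \operatorname*{Lim\,sup}_{k \to \infty} \operatorname{epi} \omega_{\mu_k}$$

with $\bar x \ne 0$ being an accumulation point of $\{x^k/\|x^k\|\}$. But $(\bar x, \bar\omega) \notin \operatorname{epi} \delta(\cdot \mid \{0\})$,
which concludes the proof. □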
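The dichotomy in Lemma 4.3 can be observed numerically. The sketch below evaluates $\omega_\mu(y) = \mu\,\omega(y/\mu)$ at a fixed $y \ne 0$ for two choices of $\omega$ that both satisfy assumption (B): the 1-coercive $\omega(y) = \frac12 y^2$ and the function $\omega(y) = \sqrt{1 + y^2}$, which grows only linearly and is therefore not 1-coercive. Both test functions and all names are assumptions made for illustration.

```python
import math

def omega_quadratic(y):
    """omega(y) = y^2 / 2, which is 1-coercive."""
    return 0.5 * y * y

def omega_sqrt(y):
    """omega(y) = sqrt(1 + y^2): convex, smooth, but omega(y)/|y| -> 1, so not 1-coercive."""
    return math.sqrt(1.0 + y * y)

def scaled(omega, mu):
    """Return the function omega_mu(y) = mu * omega(y / mu)."""
    return lambda y: mu * omega(y / mu)

y = 0.5
for mu in (1e-1, 1e-2, 1e-3):
    # First value blows up like y^2 / (2 mu); second stays near |y|.
    print(mu, scaled(omega_quadratic, mu)(y), scaled(omega_sqrt, mu)(y))
```

For the quadratic, the values away from the origin grow without bound as $\mu \downarrow 0$, consistent with an indicator-type limit; for the second function they approach $|y|$ instead.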

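The following lemma establishes monotonicity properties for the family of functions
$g \# \omega_\mu$, which come into play in Section 5.

Lemma 4.4. If $\omega(0) \le 0$, then for all $x \in \mathbb{R}^n$ the function $\mu \mapsto (g \# \omega_\mu)(x)$ is
nonincreasing on $\mathbb{R}_{++}$ (that is, nondecreasing as $\mu \downarrow 0$) and bounded by $g(x)$ from above.

Proof. Let $y \in \mathbb{R}^n$. Then for $\mu_1 > \mu_2 > 0$ we have, by the convexity of $\omega$ and $\omega(0) \le 0$,

$$\omega\Bigl(\frac{y}{\mu_1}\Bigr) = \omega\Bigl(\frac{\mu_2}{\mu_1} \frac{y}{\mu_2} + \Bigl(1 - \frac{\mu_2}{\mu_1}\Bigr) 0\Bigr) \le \frac{\mu_2}{\mu_1}\, \omega\Bigl(\frac{y}{\mu_2}\Bigr) + \Bigl(1 - \frac{\mu_2}{\mu_1}\Bigr) \omega(0) \le \frac{\mu_2}{\mu_1}\, \omega\Bigl(\frac{y}{\mu_2}\Bigr).$$

Multiplying by $\mu_1$ yields

$$\omega_{\mu_1}(y) \le \omega_{\mu_2}(y) \quad \forall y \in \mathbb{R}^n,$$

and hence for $x \in \mathbb{R}^n$ arbitrarily given, we have

$$g(u) + \omega_{\mu_1}(x - u) \le g(u) + \omega_{\mu_2}(x - u) \quad \forall u \in \mathbb{R}^n.$$

Taking the infimum over all $u \in \mathbb{R}^n$ gives

$$(g \# \omega_{\mu_1})(x) \le (g \# \omega_{\mu_2})(x),$$

which proves the monotonicity due to the choice of $\mu_1$ and $\mu_2$. The upper bound follows
by choosing $u = x$ in the infimum, which gives $(g \# \omega_\mu)(x) \le g(x) + \mu\, \omega(0) \le g(x)$. □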
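The monotonicity can be checked on a one-dimensional instance. The sketch below takes $g = |\cdot|$ and $\omega = \frac12 (\cdot)^2$ (so $\omega(0) = 0$), for which the inf-convolution has the well-known closed form of the Huber function; this instance and the function name are illustrative assumptions. The values are listed for a decreasing sequence of smoothing parameters.

```python
def inf_conv_abs(x, mu):
    """(g # omega_mu)(x) for g = |.| and omega = 0.5 * (.)^2, in closed form (Huber function)."""
    if abs(x) <= mu:
        return x * x / (2.0 * mu)
    return abs(x) - mu / 2.0

mus = [2.0, 1.0, 0.5, 0.1, 0.01]      # smoothing parameters decreasing toward zero
table = {}
for x in (-1.5, 0.0, 0.3, 2.0):
    table[x] = [inf_conv_abs(x, mu) for mu in mus]
    # Values along each row grow as mu shrinks and never exceed |x|.
    print(x, table[x])
```

Each row grows as the parameter shrinks and stays below $|x|$, in line with the lemma.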

The following result establishes the desired epi-convergence properties of the
inf-convolutions. Note that, to our knowledge, we cannot deduce it from known results
such as [53, Proposition 7.56] or [5, Theorem 4.2], since our assumptions do not
meet the requirements for the application of these results. In particular, we do not
assume $g$ to be bounded from below.

Proposition 4.5. If $\omega$ is 1-coercive, then

$$\operatorname*{e\text{-}lim}_{\mu \downarrow 0} g \# \omega_\mu = g.$$
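Proof. The fact that $\operatorname*{Lim\,inf}_{\mu \downarrow 0} \operatorname{epi} g \# \omega_\mu \supset \operatorname{epi} g$ follows immediately from [53,
Theorem 4.29 a)] when applied to the respective epigraphs.
Therefore, it is enough to show that $\operatorname*{Lim\,sup}_{\mu \downarrow 0} \operatorname{epi} g \# \omega_\mu \subset \operatorname{epi} g$.
To this end, pick $(x, \alpha) \in \operatorname*{Lim\,sup}_{\mu \downarrow 0} \operatorname{epi} g \# \omega_\mu$ arbitrarily. Then there exist
sequences $\{\mu_k\} \downarrow 0$, $\{x^k\} \to x$ and $\{\alpha_k\} \to \alpha$ such that

$$(g \# \omega_{\mu_k})(x^k) \le \alpha_k \quad \forall k \in \mathbb{N}. \tag{9}$$

With

$$u^k \in \operatorname*{argmin}_{u \in \mathbb{R}^n} \Bigl\{ g(u) + \mu_k\, \omega\Bigl(\frac{x^k - u}{\mu_k}\Bigr) \Bigr\},$$

(9) can be written as

$$g(u^k) + \mu_k\, \omega\Bigl(\frac{x^k - u^k}{\mu_k}\Bigr) \le \alpha_k \quad \forall k \in \mathbb{N}. \tag{10}$$

Using the fact, cf. [6, Theorem 9.19], that the convex, lsc function $g$ is minorized
by an affine function, say $x \mapsto b^T x + \beta$, this leads to

$$b^T u^k + \beta + \mu_k\, \omega\Bigl(\frac{x^k - u^k}{\mu_k}\Bigr) \le \alpha_k \quad \forall k \in \mathbb{N}.$$

If we assume that $\{u^k\}$ does not converge to $x$, we can rewrite this (for $k$
sufficiently large) as

$$\frac{\omega\bigl(\frac{x^k - u^k}{\mu_k}\bigr)}{\bigl\|\frac{x^k - u^k}{\mu_k}\bigr\|} \le \frac{\alpha_k - b^T u^k - \beta}{\|x^k - u^k\|}.$$

Whether $\{u^k\}$ is unbounded or not, we obtain a contradiction, since the left-hand
side tends to $+\infty$, as $\omega$ is 1-coercive, while the right-hand side remains bounded.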

Hence, $\{u^k\} \to x$. We now claim that $g(u^k) \not\to +\infty$, and hence, in particular,
$x \in \operatorname{dom} g$. If this were not the case, we invoke [6, Theorem 9.19] again to get an
affine minorant of $\omega$, say $x \mapsto c^T x + \gamma$, and infer from (10) that

$$g(u^k) + c^T(x^k - u^k) + \mu_k \gamma \le \alpha_k \quad \forall k \in \mathbb{N}.$$

This, however, leads to a contradiction if $g(u^k) \to +\infty$, since $c^T(x^k - u^k) + \mu_k \gamma \to 0$
and $\alpha_k \to \alpha < +\infty$. Thus, we have shown that $\{g(u^k)\}$ is bounded from above.
Since $g$ is lsc and $u^k \to x$, we also know that $\liminf_{k\to\infty} g(u^k) \ge g(x)$. Hence, we
may as well assume that $g(u^k) \to \bar g \ge g(x)$ and, in particular, we have $x \in \operatorname{dom} g$.
We now infer from (10) that

$$(x^k - u^k,\ \alpha_k - g(u^k)) \in \operatorname{epi} \omega_{\mu_k} \quad \forall k \in \mathbb{N}.$$

Since $x^k - u^k \to 0$ and $\alpha_k - g(u^k) \to \alpha - \bar g$, Lemma 4.3 implies

$$(0, \alpha - \bar g) \in \operatorname*{Lim\,sup}_{\mu \downarrow 0} \operatorname{epi} \omega_\mu \subset \operatorname{epi} \delta(\cdot \mid \{0\}).$$

This immediately gives

$$g(x) \le \bar g \le \alpha,$$

i.e., $(x, \alpha) \in \operatorname{epi} g$, which concludes the proof. □
We are now in a position to state the main result of this section. 
Theorem 4.6. If $\omega$ is 1-coercive then the function $s_g : (x, \mu) \mapsto (g \# \omega_\mu)(x)$ is an
epi-smoothing function for $g$ with

$$\operatorname{gph} \nabla_x s_g(\cdot, \mu) \to \operatorname{gph} \partial g,$$

and hence, in particular,

$$\operatorname*{Lim\,sup}_{\mu \downarrow 0,\ x \to \bar x} \nabla_x s_g(x, \mu) = \partial g(\bar x) \quad \forall \bar x \in \operatorname{dom} g.$$

Proof. Due to Proposition 4.5 we have $\operatorname*{e\text{-}lim}_{\mu \downarrow 0} s_g(\cdot, \mu) = \operatorname*{e\text{-}lim}_{\mu \downarrow 0} g \# \omega_\mu = g$.
The smoothness properties of $s_g(\cdot, \mu)$, with $\nabla_x s_g(\cdot, \mu) = \nabla (g \# \omega_\mu)$, follow from Lemma 4.2. The
remaining assertion is an immediate consequence of Attouch's Theorem, see [53,
Theorem 12.35]. This concludes the proof. □

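Moreau Envelopes. The most prominent choice for $\omega$ is given by

$$\omega := \tfrac12 \|\cdot\|^2.$$

The resulting inf-convolution of $\omega_\mu$ with an lsc function $g : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is
called the Moreau envelope or Moreau-Yosida regularization of $g$ and is denoted by
$e_\mu g$, i.e.,

$$e_\mu g(x) = \inf_{w \in \mathbb{R}^n} \Bigl\{ g(w) + \frac{1}{2\mu} \|w - x\|^2 \Bigr\}.$$

The set-valued map $P_\mu g : \mathbb{R}^n \rightrightarrows \mathbb{R}^n$ given by

$$P_\mu g(x) := \operatorname*{argmin}_{w \in \mathbb{R}^n} \Bigl\{ g(w) + \frac{1}{2\mu} \|w - x\|^2 \Bigr\}$$

is called the proximal mapping for $g$.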

The following properties of Moreau envelopes and proximal mappings of convex 
functions are well known, see [521 [53] or EH . 



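Proposition 4.7. Let $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ be lsc and convex and $\mu > 0$. Then the
following holds:

a) $P_\mu f$ is single-valued and Lipschitz continuous.

b) $e_\mu f$ is convex and smooth with Lipschitz gradient $\nabla e_\mu f$ given by

$$\nabla e_\mu f(x) = \frac{1}{\mu} \bigl[ x - P_\mu f(x) \bigr].$$

c) $\operatorname{argmin} f = \operatorname{argmin} e_\mu f$.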

In view of item c) it is possible to recover the minimizers of a (possibly nonsmooth)
convex function from those of its Moreau envelope. Hence, it is not even necessary to
drive the smoothing parameter to zero.
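The formulas of Proposition 4.7 can be verified on a simple instance. The sketch below takes $f = \delta(\cdot \mid [a, b])$ on the real line, for which the proximal mapping is the projection onto $[a, b]$ and the Moreau envelope is the squared distance divided by $2\mu$. The instance, the parameter values, and the function names are assumptions chosen for illustration; the gradient formula of item b) is compared against a central finite difference.

```python
def prox_indicator(x, a, b):
    """Proximal mapping of the indicator of [a, b]: the projection onto [a, b]."""
    return min(max(x, a), b)

def moreau_indicator(x, a, b, mu):
    """Moreau envelope of the indicator of [a, b]: dist(x | [a, b])^2 / (2 mu)."""
    p = prox_indicator(x, a, b)
    return (x - p) ** 2 / (2.0 * mu)

def grad_moreau_indicator(x, a, b, mu):
    """Gradient from Proposition 4.7 b): (x - P_mu f(x)) / mu."""
    return (x - prox_indicator(x, a, b)) / mu

a, b, mu, h = -1.0, 2.0, 0.25, 1e-6
checks = []
for x in (-3.0, 0.5, 4.0):
    # Central finite difference of the envelope as an independent reference.
    fd = (moreau_indicator(x + h, a, b, mu) - moreau_indicator(x - h, a, b, mu)) / (2.0 * h)
    checks.append((x, grad_moreau_indicator(x, a, b, mu), fd))
    print(checks[-1])
```

The envelope vanishes exactly on $[a, b]$, which is also the set of minimizers of the indicator, as item c) predicts.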

Since the function $x \mapsto \tfrac12 \|x\|^2$ is 1-coercive, the following result can be formulated
as a corollary of Theorem 4.6.



Corollary 4.8. Let $g : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ be lsc and convex. Then $s_g : (x, \mu) \mapsto e_\mu g(x)$
is an epi-smoothing function for $g$ with

$$\operatorname*{Lim\,sup}_{\mu \downarrow 0,\ x \to \bar x} \nabla_x s_g(x, \mu) = \partial g(\bar x) \quad \forall \bar x \in \operatorname{dom} g.$$

When $g$ is lsc and convex, the fact that $e_\mu g$ epi-converges to $g$ as $\mu \downarrow 0$ is well
known (cf. the discussion in [53] after Proposition 7.4).
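The gradient consistency statement of Corollary 4.8 can be illustrated for $g = |\cdot|$ on the real line, where $\partial g(0) = [-1, 1]$. For this choice the gradient of the Moreau envelope is $x/\mu$ clipped to $[-1, 1]$, a standard closed form that is assumed here rather than derived in the text. Approaching $\bar x = 0$ along $x = t\mu$ produces the gradient $t$, so every element of $[-1, 1]$ is attained as a limit.

```python
def grad_moreau_abs(x, mu):
    """Gradient of the Moreau envelope of |.|: x / mu clipped to [-1, 1]."""
    return max(-1.0, min(1.0, x / mu))

mu = 1e-6
targets = [-1.0, -0.5, 0.0, 0.25, 1.0]

# Points x = t * mu tend to 0 as mu -> 0, while the gradient stays equal to t.
limits_at_zero = [grad_moreau_abs(t * mu, mu) for t in targets]

# Away from the kink the gradient is already the classical derivative of |.|.
limit_at_one = grad_moreau_abs(1.0 + mu, mu)
limit_at_minus_two = grad_moreau_abs(-2.0 + mu, mu)
print(limits_at_zero, limit_at_one, limit_at_minus_two)
```

At points where $g$ is differentiable the limit set collapses to the single classical derivative, as the last two values show.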
Extended Piecewise Linear-Quadratic Functions (EPLQ) [53]. EPLQ functions
play a key role in a wide variety of applications, e.g., signal denoising [25, 26],
model selection [55], compressed sensing [27, 28, 38], robust statistics [40], Kalman
filtering [1, 2, 32], and support vector classifiers [30, 51, 54]. Examples include
arbitrary gauge functionals [53] (e.g., norms), the Huber penalty [7, 40], the hinge
loss function [30, 51, 54], and the Vapnik penalty 'A'.) 571]. For an overview of these
functions and their statistical properties see [3, 53]. In this section, we show that
the Moreau envelope mapping $g \mapsto e_\mu g$ maps the class of EPLQ functions to itself
in a very natural way.

Definition 4.9. The convex function $g : \mathbb{R}^n \to \overline{\mathbb{R}}$ is said to be extended piecewise
linear-quadratic if for some positive integer $m$ there exist a nonempty closed convex
set $U \subset \mathbb{R}^m$ (typically polyhedral), an injective matrix $R \in \mathbb{R}^{m\times n}$, a symmetric and
positive semi-definite matrix $B \in \mathbb{R}^{m\times m}$, and a vector $b \in \mathbb{R}^m$ such that

$$g(x) := \theta_{(U,B,R,b)}(x) := \sup_{u \in U} \Bigl\{ \langle u, Rx - b \rangle - \tfrac12 u^T B u \Bigr\}. \tag{11}$$

If $m = n$, $R = I$, and $b = 0$, then $g$ is said to be piecewise linear-quadratic (PLQ).

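Example 4.10 (Examples of EPLQ functions).

(1) Norms: Let $\|\cdot\|_*$ be a norm with closed unit ball $\mathbb{B}_*$. Then $\|\cdot\|_* = \theta_{(\mathbb{B}^\circ, 0, I, 0)}$,
where $\mathbb{B}^\circ := \{v \mid \langle v, u \rangle \le 1 \ \forall u \in \mathbb{B}_*\}$.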

(2) The Huber penalty: Let $\kappa > 0$. Then $\theta_{([-\kappa,\kappa]^n, I, I, 0)}$ is the Huber penalty with
threshold $\kappa$.

(3) The Vapnik penalty: Let $\varepsilon > 0$ and define $U = [0, 1]^{2n}$, $R = [I_{n\times n}\ \ {-I_{n\times n}}]^T$,
and $b = \varepsilon\, \text{ones}(2n, 1)$; then $\theta_{(U,0,R,b)}$ is the Vapnik penalty with threshold $\varepsilon$.
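The supremum in (11) can be evaluated directly in small cases. The sketch below does so for the Vapnik penalty with $n = 1$, where $U = [0,1]^2$, $B = 0$, $R = (1, -1)^T$ and $b = \varepsilon (1, 1)^T$, and compares the result with the explicit expression $\max\{|x| - \varepsilon, 0\}$. The function names and test values are illustrative assumptions.

```python
from itertools import product

def theta_vapnik_1d(x, eps):
    """sup over u in [0, 1]^2 of <u, R x - b> with R = (1, -1)^T and b = eps * (1, 1)^T."""
    r = (x - eps, -x - eps)                      # the vector R x - b
    # With B = 0 the objective is linear in u, so the supremum over the box
    # is attained at one of its four vertices.
    return max(u1 * r[0] + u2 * r[1] for u1, u2 in product((0.0, 1.0), repeat=2))

def vapnik(x, eps):
    """Explicit form of the Vapnik (epsilon-insensitive) penalty in one dimension."""
    return max(abs(x) - eps, 0.0)

eps = 0.5
samples = [-3.0, -1.0, -0.2, 0.0, 0.2, 2.0]
pairs = [(theta_vapnik_1d(x, eps), vapnik(x, eps)) for x in samples]
print(pairs)
```

Enumerating vertices only works because the quadratic term is absent; for $B \ne 0$ the inner maximization is a genuine quadratic program over $U$.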

Proposition 4.11. Let $\theta_{(U,B,R,b)}$ be an extended piecewise linear-quadratic function.
If $B$ is positive definite or $U$ is bounded, then

$$e_\mu \theta_{(U,B,R,b)} = \theta_{(U,\hat B,R,b)},$$

where $\hat B = B + \mu R R^T$. Moreover, for each $x \in \mathbb{R}^n$ there exists a saddle-point
$(\bar u, \bar v) \in U \times \mathbb{R}^n$ for the closed proper concave-convex saddle-function [52, Section
33]

$$K(u, v) := \langle Rv - b, u \rangle - \tfrac12 u^T B u + \frac{1}{2\mu} \|x - v\|^2 - \delta(u \mid U)$$

satisfying $e_\mu g(x) = K(\bar u, \bar v)$.

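Proof. Regardless of the choice of $x$, $K$ is coercive in $v$ for each $u \in U$, and if $B$ is
positive definite or $U$ is bounded, then $-K$ is coercive in $u$ for each $v \in \mathbb{R}^n$. Hence,
by [52, Theorem 37.6], for every $x \in \mathbb{R}^n$, $K$ has a saddle-point $(\bar u, \bar v) \in U \times \mathbb{R}^n$
satisfying

$$e_\mu g(x) = \inf_{v \in \mathbb{R}^n} \sup_{u \in U} K(u, v) = K(\bar u, \bar v) = \sup_{u \in U} \inf_{v \in \mathbb{R}^n} K(u, v).$$

To complete the proof observe that the problem

$$\inf_{v \in \mathbb{R}^n} K(u, v) = -\Bigl[ \langle b, u \rangle + \tfrac12 u^T B u \Bigr] + \inf_{v \in \mathbb{R}^n} \Bigl[ \langle v, R^T u \rangle + \frac{1}{2\mu} \|x - v\|^2 \Bigr]$$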

14 



JAMES V. BURKE AND TIM HOHEISEL 



has a unique solution at $v(x, u) = x - \mu R^T u$. Plugging this solution into $K$ gives
$e_\mu g(x) = \sup_{u \in U} K(u, v(x, u)) = \theta_{(U,\hat B,R,b)}(x)$. □
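A scalar sanity check of Proposition 4.11 is sketched below for $U = [-\kappa, \kappa]$, $B = 1$, $R = 1$, $b = 0$, in which case $\hat B = 1 + \mu$. The Moreau envelope is computed numerically by a ternary search (a generic method for one-dimensional convex minimization, chosen here as an assumption) and compared with $\theta_{(U,\hat B,R,b)}$. All names and parameter values are illustrative.

```python
def theta(x, B, kappa):
    """EPLQ value sup over |u| <= kappa of u*x - 0.5*B*u^2, for scalar B > 0, R = 1, b = 0."""
    u = max(-kappa, min(kappa, x / B))   # maximizer of the concave quadratic on the interval
    return u * x - 0.5 * B * u * u

def moreau_numeric(x, B, kappa, mu, iters=200):
    """Minimize v -> theta(v) + (x - v)^2 / (2 mu) by ternary search (objective is convex)."""
    def obj(v):
        return theta(v, B, kappa) + (x - v) ** 2 / (2.0 * mu)
    lo, hi = -abs(x) - 1.0, abs(x) + 1.0   # bracket containing the minimizer
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if obj(m1) <= obj(m2):
            hi = m2
        else:
            lo = m1
    return obj(0.5 * (lo + hi))

kappa, mu = 1.0, 0.5
comparison = []
for x in (-2.0, 0.0, 0.6, 3.0):
    # Left: numerical envelope with B = 1. Right: closed form with B_hat = 1 + mu.
    comparison.append((moreau_numeric(x, 1.0, kappa, mu), theta(x, 1.0 + mu, kappa)))
    print(x, comparison[-1])
```

The two columns agree up to the accuracy of the search, which is what the identity $\hat B = B + \mu R R^T$ predicts in the scalar case.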

Example 4.12 (Lasso-Problem). Given $A \in \mathbb{R}^{m\times n}$ and $b \in \mathbb{R}^m$ with $m \ll n$,
consider the nonsmooth optimization problem

$$\min_{x} f(x) := \tfrac12 \|Ax - b\|^2 + \lambda \|x\|_1, \tag{12}$$

where $\lambda > 0$. This problem is known in the literature as the Lasso-Problem.

The objective function $f$ is the sum of two convex functions, one is smooth and
the other is a nonsmooth PLQ function. By Proposition 3.1, an epi-smoothing
function for $f$ can be obtained by computing the Moreau envelope for the 1-norm.
This envelope is obtained from the proximal mapping, which in this case is commonly
referred to in the literature as soft thresholding [25, 26]. An easy computation shows
that, for $i = 1, \dots, n$,

$$\bigl( P_\mu \|\cdot\|_1 (x) \bigr)_i = \begin{cases} x_i + \mu & \text{if } x_i < -\mu, \\ x_i - \mu & \text{if } x_i > \mu, \\ 0 & \text{if } |x_i| \le \mu. \end{cases}$$
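A minimal sketch of this construction follows: soft thresholding as the proximal mapping of the 1-norm, the resulting Moreau envelope, and the smoothed objective obtained by keeping the quadratic term and replacing $\|\cdot\|_1$ by its envelope, in the spirit of Proposition 3.1 b) and c). The function names and the small test data are assumptions for illustration only.

```python
def soft_threshold(x, mu):
    """Componentwise proximal mapping of the 1-norm with parameter mu."""
    return [xi + mu if xi < -mu else xi - mu if xi > mu else 0.0 for xi in x]

def moreau_l1(x, mu):
    """Moreau envelope of the 1-norm, evaluated through the proximal point p."""
    p = soft_threshold(x, mu)
    return sum(abs(pi) for pi in p) + sum((xi - pi) ** 2 for xi, pi in zip(x, p)) / (2.0 * mu)

def smoothed_lasso(x, A, b, lam, mu):
    """0.5 * ||A x - b||^2 + lam * e_mu ||.||_1 (x), with A given as a list of rows."""
    residual = [sum(aij * xj for aij, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
    return 0.5 * sum(r * r for r in residual) + lam * moreau_l1(x, mu)

x = [2.0, -0.3, 0.0, -1.5]
mu = 0.5
p = soft_threshold(x, mu)
envelope = moreau_l1(x, mu)
value = smoothed_lasso(x, [[1.0, 0.0, 0.0, 0.0]], [1.0], 2.0, mu)
print(p, envelope, value)
```

Entries with magnitude at most $\mu$ are set to zero and the remaining ones are shrunk toward zero by $\mu$; the envelope is then a sum of Huber-type terms.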

5 Convex Composite Functions 

An important and powerful class of nonsmooth, nonconvex functions
$f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is given by

$$f(x) := g(H(x)) \quad \forall x \in \mathbb{R}^n, \tag{13}$$

where $g : \mathbb{R}^m \to \mathbb{R} \cup \{+\infty\}$ is lsc and convex and $H : \mathbb{R}^n \to \mathbb{R}^m$ is (twice) continuously
differentiable. These functions go by the name convex composite, see, e.g., [11, 12]
or [IB], and are closely related to amenable functions, see [53, Definition 10.32].
Suppose one has an epi-smoothing function $s_g$ of $g$; then it is a natural question to
ask whether $s_f(\cdot, \cdot) := s_g(H(\cdot), \cdot)$ is an epi-smoothing function of $f$. That is, do the
smoothing properties of $s_g$ (with respect to $g$) carry over to smoothing properties
of $s_f$ (with respect to $f$)? In particular, does the epi-convergence of $s_g(\cdot, \mu)$ to $g$
imply the epi-convergence of $s_f(\cdot, \mu)$ to $f$? To clarify this connection, we start with
an easy observation for which we give a self-contained proof (an alternative proof
can be obtained by applying [53, Formula 4(8)] to the respective epigraphs and the
function $F(x, \alpha) := (H(x), \alpha)$ satisfying $\operatorname{epi} f = F^{-1}(\operatorname{epi} g)$).

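Lemma 5.1. Let $s_g$ be an epi-smoothing function for $g$, and define
$s_f(\cdot, \cdot) := s_g(H(\cdot), \cdot)$. Then

$$\operatorname*{Lim\,sup}_{\mu \downarrow 0} \operatorname{epi} s_f(\cdot, \mu) \subset \operatorname{epi} f.$$

Proof. Let $(x, \alpha) \in \operatorname*{Lim\,sup}_{\mu \downarrow 0} \operatorname{epi} s_f(\cdot, \mu)$. Then there exist sequences $\{x^k\} \to x$,
$\{\alpha_k\} \to \alpha$ and $\{\mu_k\} \downarrow 0$ such that

$$s_g(H(x^k), \mu_k) \le \alpha_k \quad \forall k \in \mathbb{N},$$

i.e.,

$$(H(x^k), \alpha_k) \in \operatorname{epi} s_g(\cdot, \mu_k) \quad \forall k \in \mathbb{N}.$$

Since $(H(x^k), \alpha_k) \to (H(x), \alpha)$ we get from the epi-convergence of $s_g(\cdot, \mu)$ to $g$ that

$$(H(x), \alpha) \in \operatorname{epi} g,$$

which immediately yields

$$(x, \alpha) \in \operatorname{epi} f.$$

This proves the result. □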

We point out that in the previous result, as well as in the following two results, 
only continuity of H and no smoothness assumption is needed. 

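Proposition 5.2. Let $s_g$ be an epi-smoothing function for $g$ such that $s_g(y,\mu)$ is
nondecreasing as $\mu \downarrow 0$ for all $y \in \mathbb{R}^m$. Then for $s_f(\cdot,\cdot) := s_g(H(\cdot),\cdot)$ we have

\[ \operatorname*{e-lim}_{\mu \downarrow 0} s_f(\cdot,\mu) = f. \]

Proof. Due to Lemma 5.1 it suffices to show that

\[ \operatorname*{Lim\,inf}_{\mu \downarrow 0} \, \operatorname{epi} s_f(\cdot,\mu) \supset \operatorname{epi} f. \]

To this end, let $(x,\alpha) \in \operatorname{epi} f$, i.e., $g(H(x)) \le \alpha$. Now, let $\{\mu_k\} \downarrow 0$ be given. In
view of the monotonicity assumption we get $s_g(H(x),\mu_k) \le g(H(x)) \le \alpha$ and hence

\[ (x,\alpha) \in \operatorname{epi} s_f(\cdot,\mu_k) \quad \forall k \in \mathbb{N}. \]

With the choice $x^k := x$ and $\alpha_k := \alpha$ it follows immediately that

\[ (x,\alpha) \in \operatorname*{Lim\,inf}_{\mu \downarrow 0} \, \operatorname{epi} s_f(\cdot,\mu), \]

which concludes the proof. □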

Corollary 5.3. If, in the setting of Section 4, $\omega$ is 1-coercive with $\omega(0) \le 0$, then
for $s_g(\cdot,\mu) := g \,\#\, \omega_\mu$ we have

\[ \operatorname*{e-lim}_{\mu \downarrow 0} s_g(H(\cdot),\mu) = g \circ H. \]

Proof. The assertion follows immediately from Lemma 4.4 and Proposition 5.2. □

In the following result we employ the limiting normal cone for a (nonempty) convex
set $C \subset \mathbb{R}^m$ at $\bar x \in C$, which is given by, cf. [53, Theorem 6.9],

\[ N(\bar x \mid C) = \{ v \in \mathbb{R}^m \mid v^T (x - \bar x) \le 0 \ \ \forall x \in C \}. \]

In our setting, $C$ is the domain of an lsc, convex function $g : \mathbb{R}^m \to \mathbb{R} \cup \{+\infty\}$,
which is closed and convex.

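Lemma 5.4. Let $\{g_k\}$ be a sequence of lsc, convex functions $g_k : \mathbb{R}^m \to \mathbb{R} \cup \{+\infty\}$
converging epi-graphically to $g : \mathbb{R}^m \to \mathbb{R} \cup \{+\infty\}$. Furthermore, let $\{z^k\}$ be an
unbounded sequence such that $z^k \in \partial g_k(y^k)$ for all $k \in \mathbb{N}$ for some $\{y^k\} \to \bar y \in
\operatorname{dom} g$. Then every accumulation point of $\{z^k / \|z^k\|\}$ lies in $N(\bar y \mid \operatorname{dom} g)$.

Proof. Let $\bar z$ be an accumulation point of $\{z^k / \|z^k\|\}$. W.l.g. we can assume that
$z^k / \|z^k\| \to \bar z$. Moreover, let $y \in \operatorname{dom} g$ be given. Since $\operatorname{e-lim}_{k \to \infty} g_k = g$, we may
invoke the characterization of epi-convergence to obtain a sequence $\{\hat y^k\} \to y$ such
that $\limsup_{k \to \infty} g_k(\hat y^k) \le g(y)$. Since, by assumption, $z^k \in \partial g_k(y^k)$ for all
$k \in \mathbb{N}$, we infer

\[ g_k(\hat y^k) - g_k(y^k) \ge (z^k)^T (\hat y^k - y^k) \quad \forall k \in \mathbb{N}. \]

Dividing by $\|z^k\|$ yields

\[ \frac{g_k(\hat y^k) - g_k(y^k)}{\|z^k\|} \ge \Big( \frac{z^k}{\|z^k\|} \Big)^T (\hat y^k - y^k) \to \bar z^T (y - \bar y). \]

To prove the assertion it suffices to see that the numerator of the left-hand side
of the above inequality is bounded from above at least on a subsequence. This,
however, is true due to the choice of $\{\hat y^k\}$ and the epi-convergence of $\{g_k\}$ to $g$,
which gives $\liminf_{k \to \infty} g_k(y^k) \ge g(\bar y) > -\infty$. □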

A standard assumption in the context of convex composite functions, cf. [16], is the
basic constraint qualification, which is formally stated in the following definition.

Definition 5.5 (Basic constraint qualification). Let $f$ be given as in (13). Then $f$
is said to satisfy the basic constraint qualification (BCQ) at a point $\bar x \in \operatorname{dom} f$ if

\[ N(H(\bar x) \mid \operatorname{dom} g) \cap \operatorname{nul} H'(\bar x)^T = \{0\}. \]

Note that, in the setting of (13), BCQ always holds at a point $\bar x \in \operatorname{dom} f$ where
$H'(\bar x)^T$ has full column rank. Moreover, BCQ is always fulfilled when $g$ is
finite-valued, since then $\operatorname{dom} g = \mathbb{R}^m$ and thus $N(H(x) \mid \operatorname{dom} g) = \{0\}$ for all $x \in \mathbb{R}^n$.
The BCQ is important since it guarantees a rich subdifferential calculus for the
composition $f = g \circ H$.

Lemma 5.6. [53, Theorem 10.6] Let $f$ be given as in (13). If BCQ is satisfied at
$\bar x \in \operatorname{dom} f$, then $f$ is (subdifferentially) regular at $\bar x$ and we have

\[ \partial f(\bar x) = H'(\bar x)^T \partial g(H(\bar x)). \]

Theorem 5.7. Let $s_g$ be an epi-smoothing function for $g$. If $s_f(\cdot,\cdot) := s_g(H(\cdot),\cdot)$
is an epi-smoothing function for $f := g \circ H$, then

\[ \operatorname*{Lim\,sup}_{\mu \downarrow 0,\ x \to \bar x} \nabla_x s_f(x,\mu) = \partial f(\bar x) \]

for all $\bar x \in \operatorname{dom} f$ at which the BCQ holds.

Proof. We need only show that $\operatorname{Lim\,sup}_{\mu \downarrow 0,\ x \to \bar x} \nabla_x s_f(x,\mu) \subset \partial f(\bar x)$, since the
reverse inclusion is clear from Lemma 3.4.

To this end, let $v \in \operatorname{Lim\,sup}_{\mu \downarrow 0,\ x \to \bar x} \nabla_x s_f(x,\mu)$ be given. Then there exist
sequences $\{x^k\} \to \bar x$ and $\{\mu_k\} \downarrow 0$ such that

\[ H'(x^k)^T \nabla_x s_g(H(x^k),\mu_k) = \nabla_x s_f(x^k,\mu_k) \to v. \tag{14} \]

Put $z^k := \nabla_x s_g(H(x^k),\mu_k)$ $(k \in \mathbb{N})$. If $\{z^k\}$ were unbounded, then w.l.g.
$z^k / \|z^k\| \to \bar z \ne 0$, and, dividing (14) by $\|z^k\|$, we infer that

\[ \bar z \in \operatorname{nul} H'(\bar x)^T. \]

On the other hand, Lemma 5.4 tells us that $\bar z \in N(H(\bar x) \mid \operatorname{dom} g)$, thus,

\[ 0 \ne \bar z \in N(H(\bar x) \mid \operatorname{dom} g) \cap \operatorname{nul} H'(\bar x)^T, \]

which contradicts BCQ. Hence, $\{z^k\}$ is bounded and converges at least on a
subsequence, and due to Attouch's theorem [53, Theorem 12.35] the limit (accumulation
point) lies in $\partial g(H(\bar x))$. Using this and the fact that $H'$ is continuous, we get

\[ v \in H'(\bar x)^T \partial g(H(\bar x)) = \partial f(\bar x), \]

where the equality is due to Lemma 5.6. This concludes the proof. □

Corollary 5.8. Let $s_g$ be an epi-smoothing function for $g$, and suppose $\omega$ is
1-coercive with $\omega(0) \le 0$. Then $s_f(\cdot,\cdot) := s_g(H(\cdot),\cdot)$ is an epi-smoothing function for
$f := g \circ H$ and

\[ \operatorname*{Lim\,sup}_{\mu \downarrow 0,\ x \to \bar x} \nabla_x s_f(x,\mu) = \partial f(\bar x) \]

for all $\bar x \in \operatorname{dom} f$ at which the BCQ holds.

Proof. The result follows immediately from Corollary 5.3 and Theorem 5.7. □

We point out that, unlike in the convex case in Theorem 4.6, where we obtain the
gradient consistency condition directly via Attouch's theorem, we cannot derive
it in this case from a generalized version of Attouch's theorem for convex
composite functions as presented in [50, Theorem 2.1], since we do not meet the
assumptions there.

6 Constrained Optimization 

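We now apply the results of the previous section to the constrained optimization
problem

\[ \begin{array}{ll} \text{minimize} & \phi(x) \\ \text{subject to} & h(x) \in C, \end{array} \tag{15} \]

where $\phi : \mathbb{R}^n \to \mathbb{R}$ and $h : \mathbb{R}^n \to \mathbb{R}^m$ are smooth mappings and $C \subset \mathbb{R}^m$ is a
nonempty closed convex set. This is an example of a convex composite optimization
problem [12, 16] where the composite function $f = g \circ H$ is given by

\[ g(\gamma, y) := \gamma + \delta(y \mid C) \quad \text{and} \quad H(x) := \begin{pmatrix} \phi(x) \\ h(x) \end{pmatrix}. \]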

In this case, $g$ is the sum of a smooth convex function, $g_1(\gamma, y) := \gamma$, and a
nonsmooth convex function $g_2(\gamma, y) := \delta(y \mid C)$. Hence, by Proposition 3.1 we can
obtain an epi-smoothing function for $g$ by only smoothing the $g_2$ term. A
straightforward computation shows that

\[ e_\mu g_2(y) = \frac{1}{2\mu} \operatorname{dist}^2(y \mid C). \]

Therefore, by Corollary 5.3,

\[ s_f(x,\mu) = \phi(x) + \frac{1}{2\mu} \operatorname{dist}^2(h(x) \mid C) \tag{16} \]

is an epi-smoothing function for $f$. This is one of the classical smoothing functions
for constrained optimization [33]. The BCQ becomes the condition

\[ \operatorname{nul} h'(x)^T \cap N(h(x) \mid C) = \{0\}. \tag{17} \]

In the case where $C = \{0\}^s \times \mathbb{R}^{m-s}_-$, the function (16) is the classical
least-squares smoothing function for nonlinear programming, and (17) reduces to the
Mangasarian-Fromovitz constraint qualification (e.g., see [53, Example 6.40]).
Corollary 5.8 tells us that at every point $\bar x$ with $h(\bar x) \in C$ we have

\[ \operatorname*{Lim\,sup}_{\mu \downarrow 0,\ x \to \bar x} \nabla_x s_f(x,\mu) = \nabla \phi(\bar x) + h'(\bar x)^T N(h(\bar x) \mid C), \]

whenever condition (17) holds at $\bar x$, where, by Proposition 4.7,

\[ \nabla_x s_f(x,\mu) = \nabla \phi(x) + h'(x)^T \left( \frac{h(x) - \Pi_C(h(x))}{\mu} \right). \]
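The smoothing function (16) and its gradient are straightforward to evaluate once the projection onto $C$ is available. The following is a minimal sketch, assuming NumPy and taking $C = \{0\}^s \times \mathbb{R}^{m-s}_-$ (equalities first, then inequalities of the form $h_i(x) \le 0$, which is an assumed sign convention). The names `project_C`, `s_f`, `grad_s_f`, and the toy data `phi`, `h` are hypothetical and only serve the illustration. The analytic gradient is compared with central finite differences.

```python
import numpy as np

def project_C(y, s):
    """Project y onto C = {0}^s x R_-^(m-s) (assumed sign convention)."""
    p = np.minimum(np.asarray(y, dtype=float), 0.0)  # clip inequality part at 0
    p[:s] = 0.0                                      # equality part projects to 0
    return p

def s_f(x, mu, phi, h, s):
    """Smoothing function (16): phi(x) + dist^2(h(x) | C) / (2 mu)."""
    r = h(x) - project_C(h(x), s)
    return phi(x) + (r @ r) / (2.0 * mu)

def grad_s_f(x, mu, grad_phi, h, jac_h, s):
    """Gradient of (16): grad phi(x) + h'(x)^T (h(x) - Pi_C(h(x))) / mu."""
    r = h(x) - project_C(h(x), s)
    return grad_phi(x) + jac_h(x).T @ (r / mu)

# Hypothetical toy data: one equality and one inequality constraint (s = 1).
phi = lambda x: x[0] + x[1]
grad_phi = lambda x: np.array([1.0, 1.0])
h = lambda x: np.array([x[0] ** 2 + x[1] ** 2 - 2.0, -x[0]])
jac_h = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]], [-1.0, 0.0]])

def fd_grad(x, mu, eps=1e-6):
    """Central finite differences of s_f, used only as a sanity check."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (s_f(x + e, mu, phi, h, 1) - s_f(x - e, mu, phi, h, 1)) / (2 * eps)
    return g

x0 = np.array([-0.5, 1.0])
print(grad_s_f(x0, 0.1, grad_phi, h, jac_h, 1), fd_grad(x0, 0.1))
```

The residual `h(x) - project_C(h(x), s)` divided by `mu` is the quantity that plays the role of a multiplier estimate when such a smoothing is used inside an algorithm.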

The results of Section 5 allow us to make powerful statements about algorithms
that use the epi-smoothing function (16) to solve the optimization problem (15).
We begin by studying the case of cluster points that are feasible for (15).

Theorem 6.1. Let $s_f$ be as in (16) with $\phi$, $h$, and $C$ satisfying the hypotheses
specified in (15). Let $\{x^k\} \subset \mathbb{R}^n$ and $\{\mu_k\} \downarrow 0$ satisfy $\|\nabla_x s_f(x^k,\mu_k)\| \to 0$. Then
every feasible cluster point $\bar x$ of $\{x^k\}$ at which (17) is satisfied is a
Karush-Kuhn-Tucker point for (15), i.e.,

\[ 0 \in \partial f(\bar x) = \nabla \phi(\bar x) + h'(\bar x)^T N(h(\bar x) \mid C). \]

Proof. Lemma 5.6 implies that $\partial f(\bar x) = \nabla \phi(\bar x) + h'(\bar x)^T N(h(\bar x) \mid C)$. Hence, by
Corollary 5.8, $\bar x$ is a KKT point for (15). □



Theorem 6.1 tells us that the feasible cluster points of sequences of approximate
stationary points of $s_f$ are KKT points, but, from an algorithmic perspective,
this does not give us a mechanism for testing proximity to optimality via standard
optimality conditions. That is, it does not show how to approximate the multiplier
vector. This is addressed by the following corollary.

Corollary 6.2. Let $s_f$, $\phi$, $h$, $C$, $\{x^k\}$, and $\{\mu_k\}$ be as in Theorem 6.1 and let $\bar x$
be a cluster point of $\{x^k\}$ at which $h(\bar x) \in C$ and (17) is satisfied. If $J \subset \mathbb{N}$ is a
subsequence for which $x^k \to_J \bar x$, then the associated subsequence $\{y^k\}_J$, where

\[ y^k := \frac{h(x^k) - \Pi_C(h(x^k))}{\mu_k}, \]

remains bounded and every cluster point $\bar y$ is such that $(\bar x, \bar y)$ is a
Karush-Kuhn-Tucker pair for (15), i.e.,

\[ 0 = \nabla \phi(\bar x) + h'(\bar x)^T \bar y \quad \text{with} \quad \bar y \in N(h(\bar x) \mid C). \]

Proof. Let $J \subset \mathbb{N}$ and $\bar x$ be as in the statement of the corollary. Theorem 6.1 tells
us that $\bar x$ is a KKT point for (15), i.e., $0 \in \partial f(\bar x) = \nabla \phi(\bar x) + h'(\bar x)^T N(h(\bar x) \mid C)$.
We first show that the subsequence $\{y^k\}_J$ given above is necessarily bounded.

Suppose, to the contrary, that the sequence is not bounded. Then there is a
further subsequence $\hat J \subset J$ such that $\|y^k\| \to_{\hat J} +\infty$. With no loss in generality we
may assume that there is a unit vector $\bar y$ such that $y^k / \|y^k\| \to_{\hat J} \bar y$. Since $y^k \in
N(\Pi_C(h(x^k)) \mid C)$ for all $k$, the outer semicontinuity of the normal cone operator
$z \mapsto N(z \mid C)$ relative to $C$, cf. [53, Proposition 6.6], implies that $\bar y \in N(h(\bar x) \mid C)$.
Dividing $\|\nabla_x s_f(x^k,\mu_k)\|$ by $\|y^k\|$ and taking the limit over $\hat J$ gives $h'(\bar x)^T \bar y = 0$.
But this contradicts the BCQ (17) since $\bar y$ is a unit vector. Therefore, the sequence
$\{y^k\}_J$ is bounded.

Let $\bar y$ be any cluster point of the sequence $\{y^k\}_J$ (at least one such cluster point
must exist since this sequence is bounded). As above, $\bar y \in N(h(\bar x) \mid C)$, and by the
hypotheses, $0 = \nabla \phi(\bar x) + h'(\bar x)^T \bar y$. Hence, $\bar x$ is a KKT point for (15) and $\bar y$ is an
associated KKT multiplier. □
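The multiplier estimate of Corollary 6.2 can be observed on a one-dimensional toy problem. The sketch below is an illustration under stated assumptions, not the authors' algorithm: it takes the hypothetical data $\phi(x) = x$, $h(x) = x^2 - 1$, $C = \{0\}$, so that (16) becomes $x + (x^2-1)^2/(2\mu)$, and it computes stationary points with a plain Newton iteration (the helper name `newton_stationary` is made up). The KKT point is $\bar x = -1$ with multiplier $\bar y = 1/2$, since $1 + 2\bar x \bar y = 0$.

```python
def newton_stationary(mu, x0, iters=50):
    """Newton's method on d/dx s_f(x, mu) for phi(x) = x, h(x) = x^2 - 1, C = {0}."""
    x = x0
    for _ in range(iters):
        g = 1.0 + 2.0 * x * (x * x - 1.0) / mu   # derivative of s_f(., mu)
        H = (6.0 * x * x - 2.0) / mu             # second derivative of s_f(., mu)
        x -= g / H
    return x

x = -1.0
history = []
for mu in [1.0, 1e-1, 1e-2, 1e-3, 1e-4]:
    x = newton_stationary(mu, x)                 # warm start from the previous solution
    y = (x * x - 1.0) / mu                       # (h(x) - Pi_C(h(x))) / mu, with Pi_C = 0
    history.append((mu, x, y))

for mu, xk, yk in history:
    print(f"mu={mu:g}  x={xk:.6f}  y={yk:.6f}  KKT residual={1.0 + 2.0 * xk * yk:.2e}")
```

In this example the iterates approach the feasible point $-1$ from the infeasible side, and the scaled residuals `y` stay bounded and tend to the multiplier $1/2$, as the corollary predicts when (17) holds.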

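We now address the case of infeasible cluster points, i.e., cluster points $\bar x$ for which
$h(\bar x) \notin C$. To understand this case, we must first review the subdifferential
properties of the distance function $\operatorname{dist}(\cdot \mid C)$ and the associated convex composite
function

\[ \psi(x) := \operatorname{dist}(h(x) \mid C). \]

First, recall from [13, Proposition 3.1] that

\[ \partial \operatorname{dist}(y \mid C) = \begin{cases} N(y \mid C) \cap \mathbb{B} & \text{if } y \in C, \\ N(y \mid C + \operatorname{dist}(y \mid C)\mathbb{B}) \cap \operatorname{bdry}(\mathbb{B}) & \text{if } y \notin C, \end{cases} \tag{18} \]

where $\operatorname{bdry}(\mathbb{B})$ is the boundary of the unit ball, and, by [53, Example 8.53], we also
have

\[ \partial \operatorname{dist}(y \mid C) = N(y \mid C + \operatorname{dist}(y \mid C)\mathbb{B}) \cap \operatorname{bdry}(\mathbb{B}) = \left\{ \frac{y - \Pi_C(y)}{\operatorname{dist}(y \mid C)} \right\} \quad \forall y \notin C. \tag{19} \]

In addition, from [12, Equation 2.4], $\psi$ is subdifferentially regular on $\mathbb{R}^n$ with

\[ \partial \psi(x) = h'(x)^T \partial \operatorname{dist}(h(x) \mid C). \tag{20} \]

These formulas yield the following result.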

Theorem 6.3. Let $s_f$, $\phi$, $h$, $C$, $\{x^k\}$, and $\{\mu_k\}$ be as in Theorem 6.1 and let $\bar x$
be a cluster point of $\{x^k\}$ at which $h(\bar x) \notin C$. Then $0 \in \partial \psi(\bar x)$.

Proof. Let $J \subset \mathbb{N}$ be such that $x^k \to_J \bar x$. Since $\|\nabla_x s_f(x^k,\mu_k)\| \to 0$, we have
$\mu_k \|\nabla_x s_f(x^k,\mu_k)\| \to 0$, and consequently

\[ h'(x^k)^T \big( h(x^k) - \Pi_C(h(x^k)) \big) \to_J 0. \]

Hence, by the continuity of $\Pi_C$, (19), and (20), $0 \in \partial \psi(\bar x)$. □



Theorem 6.3 shows that any algorithm that drives $\nabla_x s_f(x^k,\mu_k)$ to zero as $\mu_k \downarrow 0$
performs admirably even when the problem (15) is itself infeasible. That is, in the
absence of feasibility, it naturally tries to locate a nonfeasible stationary point for
(15) as defined in [13]. It may happen that the original problem is feasible while all
cluster points are nonfeasible stationary points. This can be rectified by placing a
further restriction on how the iterates $\{x^k\}$ are generated.
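The behavior described by Theorem 6.3 can be seen on an infeasible one-dimensional toy problem. The sketch below is only an illustration under stated assumptions: it uses the hypothetical data $\phi(x) = x$, $h(x) = x^2 + 1$, $C = \{0\}$, for which no feasible point exists and $\psi(x) = \operatorname{dist}(h(x) \mid C) = x^2 + 1$, and it finds stationary points of (16) by a plain Newton iteration (the helper name `stationary_point` is made up).

```python
def stationary_point(mu, x0, iters=50):
    """Newton's method on d/dx s_f(x, mu) for phi(x) = x, h(x) = x^2 + 1, C = {0}."""
    x = x0
    for _ in range(iters):
        g = 1.0 + 2.0 * x * (x * x + 1.0) / mu   # derivative of s_f(., mu)
        H = (6.0 * x * x + 2.0) / mu             # second derivative, always positive
        x -= g / H
    return x

x = 0.0
trace = []
for mu in [1.0, 1e-1, 1e-2, 1e-3, 1e-4]:
    x = stationary_point(mu, x)                  # warm start from the previous solution
    dist = x * x + 1.0                           # dist(h(x) | C), never below 1
    dpsi = 2.0 * x                               # derivative of psi(x) = x^2 + 1
    trace.append((mu, x, dist, dpsi))

for mu, xk, dist, dpsi in trace:
    print(f"mu={mu:g}  x={xk:.6f}  dist={dist:.6f}  psi'(x)={dpsi:.2e}")
```

Here the stationary points converge to $0$, the minimizer of the infeasibility measure $\psi$, so the limit is a nonfeasible stationary point in the sense used above even though the constraint can never be satisfied.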

Proposition 6.4. Let $C$, $\phi$, $h$, and $s_f$ be as in (15) and (16), and let $\mu_k \downarrow 0$.
Suppose that there is a known feasible point $\hat x$ for (15). If $\{x^k\}$ is a sequence for
which $s_f(x^k,\mu_k) \le s_f(\hat x,\mu_k) = \phi(\hat x)$ for all $k = 1, 2, \dots$, then every cluster point of
$\{x^k\}$ must be feasible for (15).

Proof. Let $\bar x$ be a cluster point of $\{x^k\}$ and let $J \subset \mathbb{N}$ be such that $x^k \to_J \bar x$. If
$\bar x$ is not feasible, then $\frac{1}{2\mu_k} \operatorname{dist}^2(h(x^k) \mid C) \to_J +\infty$. But $s_f(x^k,\mu_k) = \phi(x^k) +
\frac{1}{2\mu_k} \operatorname{dist}^2(h(x^k) \mid C) \le \phi(\hat x)$, giving the contradiction $\phi(x^k) \to_J -\infty$. □

In fact, without further hypotheses, feasibility might not be attained in the limit
even in the prototypical example of convex composite optimization, the
Gauss-Newton method for solving nonlinear systems of equations. It is often the case
that the additional hypotheses employed are related to the BCQ (17). One way to
understand the role of nonfeasible stationary points and their effect on computation is
through constraint qualifications that apply to nonfeasible points. These constraint
qualifications extend (17) to points on the whole space. Among the many possible
extensions one might consider, we use one from the geometry of the subdifferential
in (18). We say that the extended constraint qualification (ECQ) for (15) is satisfied
if

\[ \operatorname{nul} h'(x)^T \cap N(h(x) \mid C + \operatorname{dist}(h(x) \mid C)\mathbb{B}) = \{0\}. \tag{21} \]

Note that this condition is well defined on all of $\mathbb{R}^n$ and reduces to (17) when
$h(x) \in C$. When $h(x) \notin C$, it is easily seen that $0 \in \partial \psi(x)$ if and only if (21)
is not satisfied. Hence, if one assumes that ECQ is satisfied at all iterates, then
nonfeasible cluster points cannot exist. For example, if $C = \{0\}$, then a standard
global constraint qualification is to assume that $h'(x)$ is everywhere surjective, i.e.,
$\operatorname{nul} h'(x)^T = \{0\}$ for all $x$. This implies (21), which simply says that $h'(x)^T h(x) \ne 0$
whenever $h(x) \ne 0$ and $h'(x)$ is surjective whenever $h(x) = 0$.

7 Final Remarks 

In this paper we have synthesized the infimal convolution smoothing ideas pro- 
posed by Beck and Teboulle in [7] with the notion of gradient consistency defined 
by Chen in [21]. To achieve this we make use of epi-convergence techniques that
are well suited to the study of the variational properties of parametrized families of 
functions. Using epi-convergence, we defined the notion of epi-smoothing for which 
we established a rudimentary calculus. Epi-smoothing is a weakening of the kinds of 
smoothing studied in [7] where the focus is on convex optimization and the deriva- 
tion of complexity results which necessitate stronger forms of smoothing. We then 
applied the epi-smoothing ideas to study the epi-smoothing properties of convex 
composite functions, a very broad and important class of nonconvex functions. In 
particular, we showed that general constrained optimization falls within this class. 
Using the epi-smoothing calculus, we easily derived the convergence properties of a 
classical smoothing approach to constrained optimization establishing the conver- 
gence properties even in the case when the underlying optimization problem is not 
feasible. This application demonstrates the power of these ideas as well as their 
ease of use. 